Exploring the plant transcriptome through phylogenetic profiling.
Abstract
Publicly available protein sequences represent only a small fraction of the full catalog of genes encoded by the genomes of different plants, such as green algae, mosses, gymnosperms, and angiosperms. By contrast, an enormous number of expressed sequence tags (ESTs) exist for a wide variety of plant species, representing a substantial part of all transcribed plant genes. Integrating protein and EST sequences in comparative and evolutionary analyses is not straightforward because the two types of sequence data are heterogeneous. By combining information from publicly available EST and protein sequences for 32 different plant species, we identified more than 250,000 plant proteins organized in more than 12,000 gene families. Approximately 60% of the proteins are absent from current sequence databases but provide important new information about plant gene families. Analysis of the distribution of gene families over different plant species through phylogenetic profiling yields insights into plant gene evolution and identifies species- and lineage-specific gene families, orphan genes, and conserved core genes across the green plant lineage. We counted a similar number of gene families, approximately 9,500, in monocotyledonous and eudicotyledonous plants and found strong evidence for the existence of at least 33,700 genes in rice (Oryza sativa). Interestingly, the larger number of genes in rice compared to Arabidopsis (Arabidopsis thaliana) can partially be explained by a larger number of species-specific single-copy genes and species-specific gene families. In addition, a majority of large gene families, typically containing more than 50 genes, are larger in rice than in Arabidopsis, whereas the opposite seems true for small gene families.
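The idea of phylogenetic profiling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the family names, the species-to-lineage table, and the classification rules below are hypothetical and chosen only for illustration. Each gene family is reduced to a presence/absence vector over species, and that vector is then used to label the family as species-specific, lineage-specific, or core.

```python
# Hypothetical toy data: gene family -> species in which a member was detected.
FAMILY_SPECIES = {
    "FAM_A": {"Arabidopsis thaliana", "Oryza sativa",
              "Physcomitrella patens", "Chlamydomonas reinhardtii"},
    "FAM_B": {"Oryza sativa"},
    "FAM_C": {"Oryza sativa", "Zea mays"},
    "FAM_D": {"Arabidopsis thaliana", "Zea mays"},
}

# Hypothetical species -> lineage table (an assumption, not from the paper).
LINEAGE = {
    "Arabidopsis thaliana": "eudicot",
    "Oryza sativa": "monocot",
    "Zea mays": "monocot",
    "Physcomitrella patens": "moss",
    "Chlamydomonas reinhardtii": "green alga",
}

SPECIES_ORDER = sorted(LINEAGE)  # fixed column order for the profiles


def profile(species_with_family, species_order=SPECIES_ORDER):
    """Return the presence/absence vector (1 or 0 per species) of one family."""
    return tuple(int(s in species_with_family) for s in species_order)


def classify(species_with_family, lineage=LINEAGE):
    """Label a family from its species distribution (illustrative rules only)."""
    lineages_present = {lineage[s] for s in species_with_family}
    if len(species_with_family) == 1:
        return "species-specific"
    if len(lineages_present) == 1:
        return "lineage-specific"
    if lineages_present == set(lineage.values()):
        return "core"  # present in every lineage of the toy table
    return "other"


if __name__ == "__main__":
    for name, species in sorted(FAMILY_SPECIES.items()):
        print(name, profile(species), classify(species))
```

With real data, the profiles would be derived from clustered protein and EST sequences, and the rules for calling a family "core" or "lineage-specific" would need tolerance for incomplete EST sampling, which this sketch ignores.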
Journal title:
- Plant Physiology
Volume 137, Issue 1
Publication date: 2005